@gsmet commented Aug 15, 2025

I have been working on a very large monolith using Hibernate ORM and these particular methods were quite high in the profiling I made.

Using this simple trick reduces things quite a lot, for what I think is an acceptable trade-off of the code looking a bit old school.

FWIW, we have been fighting against iterator() in Quarkus in critical hot paths as it generates more allocations than simply going through the list.

What is specific here is that size() is also quite slow in some cases, and it's also called by hasNext().
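The trick can be sketched with a hypothetical helper (the class and method names are illustrative, not the actual patched code): iterating by index with a single cached `size()` call avoids both the `Iterator` allocation and the repeated `size()` lookups that `hasNext()` performs on some list implementations.

```java
import java.util.List;

final class ListWalk {

    // Iterator-based: the for-each loop allocates an Iterator, and
    // each hasNext() call may re-query size() on some implementations.
    static int sumWithIterator(List<Integer> values) {
        int sum = 0;
        for (Integer value : values) {
            sum += value;
        }
        return sum;
    }

    // Index-based "old school" loop: size() is called exactly once
    // and no Iterator is allocated.
    static int sumIndexed(List<Integer> values) {
        int sum = 0;
        for (int i = 0, size = values.size(); i < size; i++) {
            sum += values.get(i);
        }
        return sum;
    }
}
```

Both variants are equivalent in result; the indexed one simply trades a little readability for fewer allocations and fewer `size()` calls on a hot path.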

This is before:

[Screenshot: profile before the change (2025-08-15 15-46-38)]

Zoom on the interesting part:

[Screenshot: 2025-08-15 15-46-53]

This is after:

[Screenshot: profile after the change (2025-08-15 15-46-44)]

Zoom on the interesting part:

[Screenshot: 2025-08-15 15-47-02]


gsmet commented Aug 15, 2025

👋 @raphw, not sure if you remember me; we interacted a bit a few years back when you rewrote the Hibernate enhancer to use ByteBuddy.

Let me know what you think of this patch, happy to adjust if needed.


gsmet commented Aug 15, 2025

FWIW, if you want to reproduce this, I'm using a benchmark I can share; it's just a bunch of generated entities in a Quarkus application.

I have scripts there to get CPU profiles and allocations profiles.


raphw commented Aug 15, 2025

Of course I remember!

I am curious: is it necessary both to avoid the iterator and to avoid the call to size()? I would expect iterators to normally be optimized by the JIT if they lie on a critical code path. Generally speaking, I am happy to optimize here.


gsmet commented Aug 15, 2025

So, avoiding the repeated size() calls makes it better across the board. Avoiding iterator() is just about saving some additional allocations, given this has become quite a hot path for us. Our experience in Quarkus is that it's good to avoid iterators in hot paths (but we still use them a lot elsewhere, of course).

When we have a lot of entities, allocations are really in the way.

FYI, in my benchmarks with this generated application of 4000 entities, Hibernate ORM enhancement + proxy generation with ByteBuddy accounts for 40% of the CPU time and allocations of the build.

This change only helps a bit and it was an easy one. For the rest, things are a lot more complex and I haven't yet figured out how to improve things in a meaningful way.

raphw merged commit 0aa4d9a into raphw:master on Aug 15, 2025

raphw commented Aug 15, 2025

If you measured, I am happy. I still made some changes, for example computing the size lazily. Could you check that the performance improvements are retained by this change?


gsmet commented Aug 15, 2025

I'll have a look tomorrow!


gsmet commented Aug 16, 2025

@raphw from what I can see, things look fine after your patch. Thanks.

I had another small question: if I have a TypePool with some caching, are the cached types updated when I update the classes?

Basically, we are doing some operations in two steps, and I would like to know if I can reuse the type pool from the first step (I need to make sure the changes made in the first step are visible to the second step).

That would be my expectation but I want to make sure it's true.


raphw commented Aug 16, 2025

I refactored a little more: the default value of the field is 0, and there is technically no guarantee that a non-default initializer value of a non-final field is visible to a different thread. Relying on the default is therefore more correct with regard to the Java memory model, though the previous version should have worked just fine in practice.
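A minimal sketch of that idea, using a hypothetical wrapper class (not the actual Byte Buddy code): the cached value is stored as size + 1 so that the JVM's default field value 0 doubles as the "not yet computed" marker. A thread that races the constructor and misses a non-default initializer write would still observe a valid state (0) and simply recompute.

```java
import java.util.List;

final class LazySizeList {

    private final List<String> delegate;

    // 0 means "not yet computed" (also the JVM default for int fields);
    // otherwise the field holds delegate.size() + 1.
    private int cachedSizePlusOne;

    LazySizeList(List<String> delegate) {
        this.delegate = delegate;
    }

    int size() {
        int cached = cachedSizePlusOne;
        if (cached == 0) {
            // Lazily compute and cache; recomputing is harmless if
            // another thread raced us and saw the default value.
            cached = delegate.size() + 1;
            cachedSizePlusOne = cached;
        }
        return cached - 1;
    }
}
```

Had the field been initialized to, say, -1 as the marker, a racing thread could in principle observe 0 and misread it as a cached size of -1; anchoring the marker on the default value sidesteps that entirely.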

As for the TypePool: no, there is no automatic mechanism for this, but you could update the cache from a listener. The listener receives the DynamicType, from which you can read a type description in its current state.
